SPEECH TRAINING DEVICES FOR PROFOUNDLY DEAF CHILDREN 



Lynne E. Bernstein, James B. Ferguson III and Moise H. Goldstein, Jr. 



Department of Electrical Engineering and Computer Science 
Johns Hopkins University 
Baltimore, Maryland, U.S.A. 



ABSTRACT 

Prelingually, profoundly deaf chil- 
dren have great difficulty achieving in- 
telligible speech. Even after intensive 
therapy, their speech is deficient in 
voice pitch, rhythm, stress and intona- 
tion, as well as segmental phonetic 
characteristics. To facilitate the 
speech training of these children, we are 
developing two interrelated personal com- 
puter (PC) based systems: a school sys- 
tem and a home system. In the school 
system, speech production is monitored by 
microphone, electroglottograph, and pneu- 
motachograph. The home system uses only 
microphone input. Both systems use video 
displays for providing feedback and rein- 
forcement. The school system allows di- 
agnosis, training by game playing, and 
specification of games to be played on 
the home system. The home system pro- 
vides directed speech practice between 
therapy sessions. 



PROJECT GOALS 

Speech training of prelingually, 
profoundly deaf children in typical 
preschool programs and in elementary 
schools for the deaf is conducted in
one-on-one sessions with a specialist.
Speech and language training is required
because these functions will not develop
spontaneously in such children [1]. However,
even with therapy, these children typi- 
cally make little progress. Even though 
speech production may improve during a 
training session, improvement typically 
does not carry over to the next session. 
We believe that this poor progress is due 
to the short training periods in school, 
the lack of adequate analysis and train- 
ing tools in the hands of therapists, and 
inadequate opportunities for guided prac- 
tice by the child outside of the therapy 
sessions. In order to help ameliorate 
these problems, we are developing two 
PC-based, interrelated speech training 
systems, one for use by a specialist in a 
school or clinic, the other for the deaf
child's home.

Speech can be described in terms of 
its phonetic and prosodic characteris- 
tics. The profoundly deaf speaker is 
likely to be deficient in the production 
of both types of characteristics. The 
prosodic characteristics are closely re- 
lated to fundamental aspects of speech 
physiology, and without control over pro-
sody, even correctly articulated phonetic
segments can be unintelligible [2][3][4].
For this reason, many current training 
programs begin with work on prosodic 
characteristics [4][5]. In the first
year of our project, just completed at
this time, a sequence of games for
training certain prosodic characteristics
of speech (i.e., sustained voicing,
intensity, and isolated syllables
produced on one breath) has been
developed and is described below.

Present development is focused on 
voice pitch, its control, and detection 
of abnormal speech physiology. Research 
at Gallaudet College has demonstrated the 
usefulness of the electroglottograph 
(EGG) and the pneumotachograph (PTG) in 
diagnosing problems in voice production 
of deaf college students [6][7]. The EGG
provides a monitor of the opening and 
closing of the vocal folds, while the PTG 
monitors the volume velocity of expira- 
tion. These two signals in conjunction 
with the acoustic signal of vocalization 
can provide significant diagnostic power 
to detect abnormalities in voicing
production.

PC-BASED SYSTEMS 

Figure 1 shows the configuration of 
the school system. The home system is 
simpler, having only the microphone input 
and using an IBM PC/XT. Both the home 
and school systems make use of high
resolution color graphics and a digital
signal processing board employing the Texas
Instruments TMS320 chip. 



ICASSP 86, TOKYO 



13. 6. 1 

CH2243-4/86/0000-0633 $1.00 © 1986 IEEE 



633 



[Figure 1: block diagram. Pneumotachograph
(PTG), electroglottograph (EGG), and
microphone (MIC) inputs pass through a
signal conditioner (programmable
gain/attenuator, filters, RMS conversion).
The PTG, EGG, MIC, and RMS signals go to an
A/D converter with an interval timer on
the IBM PC/AT bus, which also connects the
digital signal processor and the high
resolution color monitor.]

Figure 1. System configuration. Line
connections indicate analog signals. Bar
connections indicate digital I/O.



Hardware development has included 
design of an analog preprocessor. Pro- 
grammable preamplifiers and filters allow 
flexible conditioning of the input sig- 
nals. A data acquisition board follows 
with A/D conversion and data acquisition 
all under program control. 

The major engineering task in real- 
izing the system has been software 
development. Most of the software has 
been written in the C programming 
language. The overall aim of the en- 
gineering design is to have a very flexi- 
ble system, yet one that is easy for 
therapists and students to use. These 
aims have been met largely through exten- 
sive software control and through menus 
by which the user selects functions or 
games and sets parameters. 

SPEECH TRAINING GAMES 

The following is a brief description 
of some of the training games that have 
been written and are now in use. 

Sustained Vocalization 

The goal of the sustained vocalization
game is to train the child to produce
sustained voicing for 2, 3, 4, 5, or 6 s.
Each duration can be presented as a
separate level of the game. The
vocalization required of the child is
typically a vowel or consonant-vowel
syllable that the child can already
produce relatively easily. Figure 2 shows
the monitor following 4 trials of the
game. The game is voice activated, and a
color bar is filled during the duration of
the child's vocalization. If that
vocalization is the required length, one
of a set of animated images is presented.
Vocalizations that are intermittent or too
short are not followed by animation. The
required duration can be changed at any
point in the program.
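
The success rule of the sustained
vocalization game can be sketched in C as
follows. This is an illustration only, not
the project's code: the frame length
(FRAME_MS), the voicing threshold, and the
function name sustained_trial_ok are all
assumptions, and the input is taken to be
a buffer of per-frame RMS levels.

```c
#include <stddef.h>

/* Assumed duration of one RMS frame; not a value from the paper. */
#define FRAME_MS 10

/* Returns 1 if the first voiced run in the trial lasts at least
 * required_ms without interruption, and 0 otherwise.
 * A frame counts as voiced when its RMS level reaches the
 * (hypothetical) voicing_threshold. */
int sustained_trial_ok(const double *rms, size_t n_frames,
                       double voicing_threshold, unsigned required_ms)
{
    size_t i = 0;
    unsigned voiced_ms = 0;

    /* Voice activation: skip any silence before the child starts. */
    while (i < n_frames && rms[i] < voicing_threshold)
        i++;

    /* Measure the first continuous voiced run. Any interruption
     * ends the run, so intermittent voicing cannot accumulate. */
    while (i < n_frames && rms[i] >= voicing_threshold) {
        voiced_ms += FRAME_MS;
        i++;
    }

    return voiced_ms >= required_ms;
}
```

With 10 ms frames, a 2 s level would be
tested with required_ms set to 2000; the
same routine serves the 3 to 6 s levels by
changing that one parameter.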



Repeated Short Vocalizations

The goal of this game is to produce 
in one breath a specified number (i.e., 
1-15) of short isolated syllables. Usu- 
ally one consonant-vowel combination will 
be chosen for repetition. The number of 
syllables and the rate of production (1,
2, or 3 syllables per s) can be indepen-
dently specified by the user. Figures 3a
and 3b show two graphics from the game.
The game employs graphics of bird "foot 
prints" to represent the sequence of tar- 
get syllables. After each production of 
an isolated syllable, a foot print 
changes color, starting from the left of 
the screen and moving right. At the same 
time, a graphic of a bird is moved along 
above the foot prints (Figure 3a). An
animated worm moves across the screen 
beneath the foot prints at the specified 
rate of syllable production. If the 
child successfully completes the speci- 
fied number of isolated syllables before 
the worm reaches the end of the line of 
tracks, the bird picks up the worm and 
flies across the screen (Figure 3b). No
animation follows an inadequate attempt. 
Especially important for this exercise is 
that the child is required to interrupt 
voicing between each syllabic articulato- 
ry maneuver. Ongoing voicing without 
interruption will not move the bird 
across the screen. 
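
The requirement that voicing be
interrupted between syllables, together
with the time limit set by the worm, can
be sketched in C as below. This is a
hedged illustration rather than the game's
actual logic: the function names, the
per-frame RMS input, the threshold, and
the assumption that the worm's deadline
equals the target count divided by the
syllable rate are all hypothetical.

```c
#include <stddef.h>

/* Counts completed syllables: each voiced run that is followed by
 * an unvoiced frame counts once. Voicing that never stops counts
 * as zero, mirroring the rule that uninterrupted voicing does not
 * move the bird. */
size_t count_completed_syllables(const double *rms, size_t n_frames,
                                 double threshold)
{
    size_t count = 0;
    int voiced = 0;

    for (size_t i = 0; i < n_frames; i++) {
        int now = rms[i] >= threshold;
        if (!now && voiced)
            count++;            /* voiced-to-silent transition */
        voiced = now;
    }
    return count;
}

/* Returns 1 if at least `target` syllables are completed before the
 * assumed deadline of target / rate_per_s seconds, else 0. */
int repeated_trial_ok(const double *rms, size_t n_frames,
                      double threshold, unsigned target,
                      unsigned rate_per_s, unsigned frames_per_s)
{
    size_t deadline;

    if (rate_per_s == 0)
        return 0;
    deadline = (size_t)target * frames_per_s / rate_per_s;
    if (deadline > n_frames)
        deadline = n_frames;

    return count_completed_syllables(rms, deadline, threshold) >= target;
}
```

Counting on the voiced-to-silent
transition is one possible design choice;
it ties each foot print to a finished
syllable rather than to a voicing onset.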

Immediate Feedback for Intensity Changes



This exercise introduces the child to a
perceptual loudness scale that is mapped
onto a vertical bar composed of three
different color blocks. As the child
vocalizes into a microphone, the image of
a balloon rises or falls in real time next
to the vertical bar. The scale is divided
into regions for quiet voice,
conversational voice, and loud voice.
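
One way to picture the three-region scale
and the balloon position is the C sketch
below. The region boundaries, the level
units, the number of screen rows, and both
function names are assumptions made for
illustration; the paper does not give the
actual mapping.

```c
/* The three regions of the loudness scale named in the text. */
typedef enum {
    VOICE_QUIET,
    VOICE_CONVERSATIONAL,
    VOICE_LOUD
} loudness_region;

/* Classifies a measured level using two hypothetical boundaries:
 * levels below quiet_max are quiet, levels below conv_max are
 * conversational, and everything else is loud. */
loudness_region classify_level(double level, double quiet_max,
                               double conv_max)
{
    if (level < quiet_max)
        return VOICE_QUIET;
    if (level < conv_max)
        return VOICE_CONVERSATIONAL;
    return VOICE_LOUD;
}

/* Maps a level linearly onto a balloon row between 0 (bottom) and
 * rows - 1 (top), clamping levels outside [min_level, max_level]. */
int balloon_row(double level, double min_level, double max_level,
                int rows)
{
    double t = (level - min_level) / (max_level - min_level);

    if (t < 0.0)
        t = 0.0;
    if (t > 1.0)
        t = 1.0;
    return (int)(t * (rows - 1) + 0.5);   /* round to nearest row */
}
```

Calling balloon_row once per display
update would give the real-time rise and
fall described above, while classify_level
identifies which color block the balloon
is beside.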



Intensity Game with Star or Score Feedback

Once the child understands the loud- 
ness scale, a game is available to work 
further on the control of intensity. On
a random basis, a target intensity level
is indicated, and the child must attempt 
to produce that level. A pointing hand 
indicates the target level, and after the 
child vocalizes a balloon rises to the 
level that was achieved. Figure 4 shows 
a graphic of the balloon at the level 
specified by the pointing hand. Correct 
response results in receiving a star or a 
numerical score, depending on the version 
of the game being run. In this game, the 
balloon does not rise to the intensity 
level achieved until 2 s of speech have 
been sampled. In this way, the child 
cannot use immediate feedback to adjust 
vocalization to the required intensity. 
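
The delayed-feedback rule can be sketched
in C as below: nothing is reported until a
full 2 s window has been sampled, and only
then is the achieved level compared with
the target. This is a minimal sketch under
stated assumptions; averaging the frame
levels, the two region boundaries, and the
function names are hypothetical and not
taken from the project software.

```c
#include <stddef.h>

/* Returns the achieved region (0 = quiet, 1 = conversational,
 * 2 = loud) from the mean of the first window_frames levels, or -1
 * if the full window has not been sampled yet. Returning -1 until
 * the window is complete is what withholds immediate feedback. */
int trial_region(const double *level, size_t n_frames,
                 size_t window_frames, double quiet_max,
                 double conv_max)
{
    double sum = 0.0;
    double mean;

    if (window_frames == 0 || n_frames < window_frames)
        return -1;

    for (size_t i = 0; i < window_frames; i++)
        sum += level[i];
    mean = sum / (double)window_frames;

    if (mean < quiet_max)
        return 0;
    if (mean < conv_max)
        return 1;
    return 2;
}

/* Awards one point (or one star) when the achieved region matches
 * the target region, and nothing otherwise. */
int score_trial(int target_region, int achieved_region)
{
    return (achieved_region >= 0 && achieved_region == target_region)
               ? 1 : 0;
}
```

At an assumed 100 frames per second,
window_frames would be 200 for the 2 s
sampling period.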

Intensity Game with Limited Feedback 

In the games described above, a 
visual scale or display provides a direct 
analog to the child's vocalizations. One
goal of speech training is that the child 
achieve vocal control independent of such 
feedback. A game was written to provide 
limited delayed feedback for control of 
vocal intensity. In this game, the color 
of blinking balloons held by a clown is 
used to signal the required loudness of 
vocalization. Trials are voice activated 
and sampling continues for 2 s. Success 
results in a star appearing over the bal- 
loon associated with the particular tri- 
al. Figure 5 is a graphic taken during 
the course of the game. Incorrect vocal- 
ization results in no change in the 
display, and the next balloon in the se- 
quence begins to blink. 
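
The trial sequencing of this game can be
sketched in C as a small state machine.
The structure, its field names, the fixed
maximum of eight balloons, and the integer
coding of loudness regions are all
assumptions for illustration; only the
behavior (a star on success, no change on
failure, then the next balloon blinks)
comes from the description above.

```c
#define MAX_BALLOONS 8   /* assumed upper limit, not from the paper */

/* State of one game: a target region per balloon, a star flag per
 * balloon, the number of balloons in use, and the index of the
 * balloon that is currently blinking. */
typedef struct {
    int target[MAX_BALLOONS];
    int star[MAX_BALLOONS];
    int n;
    int current;
} balloon_game;

/* Records the outcome of one trial and advances to the next
 * balloon. A correct level sets a star; an incorrect level changes
 * nothing. Returns 1 while balloons remain, 0 when the sequence is
 * finished. */
int balloon_game_step(balloon_game *g, int achieved_region)
{
    if (g->current >= g->n)
        return 0;

    if (achieved_region == g->target[g->current])
        g->star[g->current] = 1;

    g->current++;
    return g->current < g->n;
}
```

The achieved region passed to
balloon_game_step would come from whatever
routine classifies the 2 s sample of the
child's vocalization.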

DISCUSSION 

A fundamental aspect of the project 
from the beginning has been the partici-
pation of specialists in speech training
and of profoundly deaf children. Our system
is now being used at the Kendall Elemen- 
tary School of Gallaudet College and also 
in the laboratory at Johns Hopkins 
University. These games have received an
enthusiastic response from children aged
3-5 and 9-11 years.

One of the constant issues raised in 
the process of developing these systems 
is the effect of isolating specific 
speech characteristics for training. 
When, for example, training is focused on 
control of intensity, children's voice 
fundamental frequency tends to co-vary, 
as does voice quality. Further, it is 
possible for a speaker to achieve specif- 
ic acoustic characteristics using abnor- 
mal vocal maneuvers. For example, as a 
child attempts to increase vocal intensi- 
ty, he may produce breathy or harsh voice 
that may result from inadequate approxi- 
mation of the vocal folds. The dilemma 
is to provide specific training in a par- 
ticular voice or phonetic characteristic
without inadvertently giving experience
and reinforcement for inadequate or ab- 
normal performance of another charac- 
teristic. The resolution to this dilemma 
at this time seems to be multifold. In- 
clusion of tools for diagnosis and games 
using the EGG and PTG can help to correct 
abnormal physiology that might neverthe- 
less result in partially adequate acous- 
tic signals. Another partial solution is 
to produce a large number of games and/or 
diagnostic tools that depend on a wide 
variety of speech characteristics, both 
in isolation and combination. Of prime 
importance is the intervention of the in- 
dividual who works with the child. This 
person must make judgments about the ap- 
propriate points at which to introduce 
and withdraw training exercises. Our 
system is intended to be a significant 
aid to the speech-language pathologist or 
speech therapist working with deaf chil- 
dren. 

ACKNOWLEDGEMENTS 

This work is an Orphan Product 
Development Study supported by NIH/NINCDS 
(N01 NS-4-2372). Dr. James Mahshie has
provided considerable input to this pro- 
ject in all its stages. Our appreciation 
is extended to Ms. Betty Waddy-Smith, 
speech-language pathologist, and Ms. Di- 
ane Vari, speech therapist, and the deaf 
children in this project. Also, thanks 
are due to Mr. Silvio P. Eberhardt for 
his help in preparing this paper. 

REFERENCES 

[1] S. P. Quigley. "Effects of early
hearing impairment on normal
language development." In F. N.
Martin (Ed.), Pediatric Audiology.
Englewood Cliffs, NJ: Prentice-Hall,
1978.

[2] K. N. Stevens, R. S. Nickerson and
M. H. Rollins. "On describing the
suprasegmental properties of the
speech of deaf children." Bolt,
Beranek and Newman Report 3955,
1978.

[3] J. M. Pickett. The Sounds of Speech
Communication. Baltimore, MD:
University Park Press, 1980.

[4] D. Ling. Speech and the
Hearing-Impaired Child: Theory and
Practice. Washington, D.C.: The
Alexander Graham Bell Association for
the Deaf, Inc., 1976.

[5] D. Ling. "Speech development in
hearing-impaired children." Journal
of Communication Disorders, 1978,
11, 119-124.






[6] A. Hasegawa, J. Mahshie, E. Herbert
and J. M. Pickett. "Electroglottographic,
aerodynamic and acoustic study of normal
and breathy phonation." Journal of the
Acoustical Society of America, 1984, 25,
S58 (Abstract).

[7] J. Mahshie, A. Hasegawa, M. Mars,
E. Herbert and P. Brandt. "Voice
fundamental frequency training of
hearing impaired speakers." Journal of
the Acoustical Society of America, 1983,
13, S15 (Abstract).




Figure 2. Graphic from sustained vocali- 
zation game. White bars represent dura- 
tion of voicing. 




Figure 3a. Graphic from repeated short 
vocalization game. Bird moves one foot 
print for every syllable vocalized. 




Figure 3b. Graphic from repeated short 
vocalization game. Following successful 
sequence, bird flies across screen. 




Figure 4. Graphic from intensity game 
with feedback. Child must make balloon 
rise to the level indicated by the hand 
on the right of the screen. 




Figure 5. Graphic from intensity game 
with limited feedback. Child must produce 
intensity levels coded by the color 
of the balloons. 






